Overview

Dataset info

Number of variables: 40
Number of observations: 1000
Missing cells: 1000 (2.5%)
Duplicate rows: 0 (0.0%)
Total size in memory: 1.4 MiB
Average record size in memory: 1.5 KiB
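
The missing-cell percentage can be checked against the other figures in this overview. The sketch below is a minimal illustration in Python, assuming the percentage is the number of missing cells divided by the total number of cells (variables times observations); the variable names are illustrative only.

```python
# Figures copied from the dataset overview above.
n_variables = 40
n_observations = 1000
missing_cells = 1000

# Assumption: the percentage is missing cells divided by total cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")  # 2.5%
```

The 1000 missing cells match the single fully missing column _c39 reported in the warnings; every other variable in this report shows 0 missing values.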

Variable types

CAT: 24
NUM: 14
UNSUPPORTED: 1
BOOL: 1

Reproduction info

Date of analysis: 2020-03-03 10:47:19.666311
Version: pandas-profiling v2.4.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
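
A report like this one can also be produced from Python rather than the command line. The sketch below is a minimal illustration, assuming pandas and pandas-profiling 2.x are installed; the helper name build_report, the default output file name and the report title are hypothetical and do not come from this report.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the packages.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is an arbitrary example value.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_file=Path(html_path))
```

Calling build_report("YOUR_FILE.csv") would then write report.html next to the script; the default settings are used here, not the config.yaml referenced above.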

Warnings

_c39 has 1000 (100.0%) missing values Missing
_c39 is an unsupported type, check if it needs cleaning or further analysis Warning
capital-gains has 508 (50.8%) zeros Zeros
capital-loss has 475 (47.5%) zeros Zeros
incident_date only contains datetime values, but is categorical. Consider applying pd.to_datetime() Type
incident_date has a high cardinality: 60 distinct values Warning
incident_hour_of_the_day has 52 (5.2%) zeros Zeros
incident_location has a high cardinality: 1000 distinct values Warning
injury_claim has 25 (2.5%) zeros Zeros
policy_bind_date only contains datetime values, but is categorical. Consider applying pd.to_datetime() Type
policy_bind_date has a high cardinality: 951 distinct values Warning
property_claim has 19 (1.9%) zeros Zeros
umbrella_limit has 798 (79.8%) zeros Zeros
months_as_customer is highly correlated with age High Correlation
age is highly correlated with months_as_customer High Correlation
vehicle_claim is highly correlated with total_claim_amount High Correlation
total_claim_amount is highly correlated with vehicle_claim High Correlation
auto_model is highly correlated with auto_make High Correlation
auto_make is highly correlated with auto_model High Correlation
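
The two Type warnings above suggest applying pd.to_datetime() to incident_date and policy_bind_date. The sketch below shows one possible way to do that, assuming the columns hold YYYY-MM-DD strings as the frequency tables in this report suggest. The three-row frame is a hypothetical stand-in for the real dataset, built from dates that appear in those tables.

```python
import pandas as pd

# Hypothetical three-row frame; the date strings appear in this report's tables.
df = pd.DataFrame(
    {
        "incident_date": ["2015-02-02", "2015-02-17", "2015-01-07"],
        "policy_bind_date": ["1992-08-05", "2006-01-01", "1992-04-28"],
    }
)

for column in ["incident_date", "policy_bind_date"]:
    # An explicit format matches the 10-character YYYY-MM-DD strings.
    df[column] = pd.to_datetime(df[column], format="%Y-%m-%d")

print(df.dtypes)
```

After the conversion both columns should be profiled as dates rather than as high-cardinality categoricals.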

Variables

_c39
Unsupported

MISSING
UNSUPPORTED
Missing: 1000
Missing (%): 100.0%
Memory size: 7.9 KiB

age
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 46
Unique (%): 4.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.948
Minimum: 19
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB
Mini histogram

Quantile statistics

Minimum: 19
5-th percentile: 26
Q1: 32
Median: 38
Q3: 44
95-th percentile: 57
Maximum: 64
Range: 45
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 9.140286694
Coefficient of variation (CV): 0.2346792311
Kurtosis: -0.260255015
Mean: 38.948
Median Absolute Deviation (MAD): 7.367272
Skewness: 0.4789880471
Sum: 38948
Variance: 83.54484084
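
Several of the age statistics above are tied together by simple identities: range is maximum minus minimum, IQR is Q3 minus Q1, variance is the standard deviation squared, CV is standard deviation over mean, and the sum is mean times the number of observations. The sketch below is a minimal consistency check in Python that uses only the values reported above; the variable names are illustrative.

```python
import math

# Values copied from the age statistics above.
n = 1000
mean = 38.948
std = 9.140286694
minimum, maximum = 19, 64
q1, q3 = 32, 44

value_range = maximum - minimum  # reported Range
iqr = q3 - q1                    # reported Interquartile range (IQR)
variance = std ** 2              # reported Variance
cv = std / mean                  # reported Coefficient of variation (CV)
total = mean * n                 # reported Sum

print(value_range, iqr)
print(math.isclose(variance, 83.54484084, rel_tol=1e-6))
print(math.isclose(cv, 0.2346792311, rel_tol=1e-6))
print(math.isclose(total, 38948, rel_tol=1e-9))
```

The same identities can be applied to the other numeric variables in this report.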
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[19. 22.5 25.5 28.5 43.5 48.5 61.5 64. ], "bayesian blocks" binning strategy used)
Value              Count  Frequency (%)
43                 49     4.9%
39                 48     4.8%
41                 45     4.5%
34                 44     4.4%
30                 42     4.2%
31                 42     4.2%
38                 42     4.2%
37                 41     4.1%
33                 39     3.9%
32                 38     3.8%
Other values (36)  570    57.0%
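
The "Other values" row of the age frequency table can be derived from the rest of the table. The sketch below is a minimal illustration in Python, assuming the row aggregates every observation and every distinct value not listed in the ten rows above it; the variable names are illustrative.

```python
# Counts of the ten most frequent ages, copied from the table above.
top_counts = [49, 48, 45, 44, 42, 42, 42, 41, 39, 38]
n_observations = 1000
n_distinct = 46

other_count = n_observations - sum(top_counts)
other_distinct = n_distinct - len(top_counts)
other_pct = 100 * other_count / n_observations

print(other_distinct, other_count, f"{other_pct:.1f}%")  # 36 570 57.0%
```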
 
Value  Count  Frequency (%)
19     1      0.1%
20     1      0.1%
21     6      0.6%
22     1      0.1%
23     7      0.7%
 
Value  Count  Frequency (%)
64     2      0.2%
63     2      0.2%
62     4      0.4%
61     10     1.0%
60     9      0.9%
 
authorities_contacted
Categorical

Distinct count: 5
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Police
292
Fire
223
Other
198
Ambulance
196
None
91
ValueCountFrequency (%) 
Police 292 29.2%
 
Fire 223 22.3%
 
Other 198 19.8%
 
Ambulance 196 19.6%
 
None 91 9.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length9
Mean length5.762
Min length4
Scatter

auto_make
Categorical

HIGH CORRELATION
Distinct count14
Unique (%)1.4%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
Dodge
 
80
Suburu
 
80
Saab
 
80
Nissan
 
78
Chevrolet
 
76
Other values (9)
606
ValueCountFrequency (%) 
Dodge 80 8.0%
 
Suburu 80 8.0%
 
Saab 80 8.0%
 
Nissan 78 7.8%
 
Chevrolet 76 7.6%
 
Ford 72 7.2%
 
BMW 72 7.2%
 
Toyota 70 7.0%
 
Audi 69 6.9%
 
Volkswagen 68 6.8%
 
Other values (4) 255 25.5%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length10
Mean length5.703
Min length3
Scatter

auto_model
Categorical

HIGH CORRELATION
Distinct count39
Unique (%)3.9%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
RAM
 
43
Wrangler
 
42
Neon
 
37
A3
 
37
MDX
 
36
Other values (34)
805
ValueCountFrequency (%) 
RAM 43 4.3%
 
Wrangler 42 4.2%
 
Neon 37 3.7%
 
A3 37 3.7%
 
MDX 36 3.6%
 
Jetta 35 3.5%
 
Passat 33 3.3%
 
A5 32 3.2%
 
Legacy 32 3.2%
 
Pathfinder 31 3.1%
 
Other values (29) 642 64.2%
 

Composition

Contains charsTrue
Contains digitsTrue
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length14
Mean length5.178
Min length2
Scatter

auto_year
Real number (ℝ≥0)

Distinct count21
Unique (%)2.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2005.103
Minimum1995
Maximum2015
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum1995
5-th percentile1995
Q12000
median2005
Q32010
95-th percentile2014
Maximum2015
Range20
Interquartile range (IQR)10

Descriptive statistics

Standard deviation6.015860835
Coefficient of variation (CV)0.003000275215
Kurtosis-1.171867756
Mean2005.103
Median Absolute Deviation (MAD)5.179266
Skewness-0.04828880711
Sum2005103
Variance36.19058158
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1995. 1995.5 2014.5 2015. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
1995 56 5.6%
 
1999 55 5.5%
 
2005 54 5.4%
 
2011 53 5.3%
 
2006 53 5.3%
 
2007 52 5.2%
 
2003 51 5.1%
 
2010 50 5.0%
 
2009 50 5.0%
 
2013 49 4.9%
 
Other values (11) 477 47.7%
 
ValueCountFrequency (%) 
1995 56 5.6%
 
1996 37 3.7%
 
1997 46 4.6%
 
1998 40 4.0%
 
1999 55 5.5%
 
ValueCountFrequency (%) 
2015 47 4.7%
 
2014 44 4.4%
 
2013 49 4.9%
 
2012 46 4.6%
 
2011 53 5.3%
 

bodily_injuries
Categorical

Distinct count3
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
0
340
2
332
1
328
ValueCountFrequency (%) 
0 340 34.0%
 
2 332 33.2%
 
1 328 32.8%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length1
Mean length1
Min length1
Scatter

capital-gains
Real number (ℝ≥0)

ZEROS
Distinct count338
Unique (%)33.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean25126.1
Minimum0
Maximum100500
Zeros508
Zeros (%)50.8%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q351025
95-th percentile70615
Maximum100500
Range100500
Interquartile range (IQR)51025

Descriptive statistics

Standard deviation27872.18771
Coefficient of variation (CV)1.109292238
Kurtosis-1.276703511
Mean25126.1
Median Absolute Deviation (MAD)25842.7704
Skewness0.4788502296
Sum25126100
Variance776858847.6
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 400. 23000. 34250. 71450. 84400. 100500.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 508 50.8%
 
46300 5 0.5%
 
68500 4 0.4%
 
51500 4 0.4%
 
48900 3 0.3%
 
51100 3 0.3%
 
52600 3 0.3%
 
47600 3 0.3%
 
51700 3 0.3%
 
63600 3 0.3%
 
Other values (328) 461 46.1%
 
ValueCountFrequency (%) 
0 508 50.8%
 
800 1 0.1%
 
10000 1 0.1%
 
11000 1 0.1%
 
12100 1 0.1%
 
ValueCountFrequency (%) 
100500 1 0.1%
 
98800 1 0.1%
 
94800 1 0.1%
 
91900 1 0.1%
 
90700 1 0.1%
 

capital-loss
Real number (ℝ)

ZEROS
Distinct count354
Unique (%)35.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-26793.7
Minimum-111100
Maximum0
Zeros475
Zeros (%)47.5%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum-111100
5-th percentile-72305
Q1-51500
median-23250
Q30
95-th percentile0
Maximum0
Range111100
Interquartile range (IQR)51500

Descriptive statistics

Standard deviation28104.09669
Coefficient of variation (CV)-1.048906896
Kurtosis-1.3138745
Mean-26793.7
Median Absolute Deviation (MAD)25976.2244
Skewness-0.391471943
Sum-26793700
Variance789840250.6
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-111100. -92500. -75050. -61550. -39250. -30050. -19050. -2850. 0.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 475 47.5%
 
-53700 5 0.5%
 
-50300 5 0.5%
 
-31700 5 0.5%
 
-49200 4 0.4%
 
-61400 4 0.4%
 
-53800 4 0.4%
 
-51000 4 0.4%
 
-31400 4 0.4%
 
-45300 4 0.4%
 
Other values (344) 486 48.6%
 
ValueCountFrequency (%) 
-111100 1 0.1%
 
-93600 1 0.1%
 
-91400 1 0.1%
 
-91200 1 0.1%
 
-90600 1 0.1%
 
ValueCountFrequency (%) 
0 475 47.5%
 
-5700 1 0.1%
 
-6300 1 0.1%
 
-8500 1 0.1%
 
-10600 1 0.1%
 

collision_type
Categorical

Distinct count4
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
Rear Collision
292
Side Collision
276
Front Collision
254
?
178
ValueCountFrequency (%) 
Rear Collision 292 29.2%
 
Side Collision 276 27.6%
 
Front Collision 254 25.4%
 
? 178 17.8%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length15
Mean length11.94
Min length1
Scatter

fraud_reported
Boolean

Distinct count: 2
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
N
753
Y
247
ValueCountFrequency (%) 
N 753 75.3%
 
Y 247 24.7%
 

incident_city
Categorical

Distinct count7
Unique (%)0.7%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
Springfield
157
Arlington
152
Columbus
149
Northbend
145
Hillsdale
141
Other values (2)
256
ValueCountFrequency (%) 
Springfield 157 15.7%
 
Arlington 152 15.2%
 
Columbus 149 14.9%
 
Northbend 145 14.5%
 
Hillsdale 141 14.1%
 
Riverwood 134 13.4%
 
Northbrook 122 12.2%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length11
Mean length9.287
Min length8
Scatter

incident_date
Categorical

TYPE DATE
HIGH CARDINALITY
Distinct count60
Unique (%)6.0%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
2015-02-02
 
28
2015-02-17
 
26
2015-01-07
 
25
2015-01-10
 
24
2015-02-04
 
24
Other values (55)
873
ValueCountFrequency (%) 
2015-02-02 28 2.8%
 
2015-02-17 26 2.6%
 
2015-01-07 25 2.5%
 
2015-01-10 24 2.4%
 
2015-02-04 24 2.4%
 
2015-01-24 24 2.4%
 
2015-01-19 23 2.3%
 
2015-01-08 22 2.2%
 
2015-01-13 21 2.1%
 
2015-01-30 21 2.1%
 
Other values (50) 762 76.2%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length10
Mean length10
Min length10
Scatter

incident_hour_of_the_day
Real number (ℝ≥0)

ZEROS
Distinct count24
Unique (%)2.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11.644
Minimum0
Maximum23
Zeros52
Zeros (%)5.2%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q16
median12
Q317
95-th percentile23
Maximum23
Range23
Interquartile range (IQR)11

Descriptive statistics

Standard deviation6.951372928
Coefficient of variation (CV)0.5969918351
Kurtosis-1.192940152
Mean11.644
Median Absolute Deviation (MAD)6.032104
Skewness-0.03558446644
Sum11644
Variance48.32158559
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 22.5 23. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
17 54 5.4%
 
3 53 5.3%
 
0 52 5.2%
 
23 51 5.1%
 
16 49 4.9%
 
4 46 4.6%
 
13 46 4.6%
 
10 46 4.6%
 
6 44 4.4%
 
9 43 4.3%
 
Other values (14) 516 51.6%
 
ValueCountFrequency (%) 
0 52 5.2%
 
1 29 2.9%
 
2 31 3.1%
 
3 53 5.3%
 
4 46 4.6%
 
ValueCountFrequency (%) 
23 51 5.1%
 
22 38 3.8%
 
21 42 4.2%
 
20 34 3.4%
 
19 40 4.0%
 

incident_location
Categorical

UNIQUE
HIGH CARDINALITY
Distinct count1000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
1491 Francis Ridge
 
1
9737 Solo Hwy
 
1
4629 Elm Ridge
 
1
5812 3rd Hwy
 
1
7877 Sky Lane
 
1
Other values (995)
995
ValueCountFrequency (%) 
1491 Francis Ridge 1 0.1%
 
9737 Solo Hwy 1 0.1%
 
4629 Elm Ridge 1 0.1%
 
5812 3rd Hwy 1 0.1%
 
7877 Sky Lane 1 0.1%
 
4876 Washington Drive 1 0.1%
 
5806 Embaracadero St 1 0.1%
 
2204 Washington Lane 1 0.1%
 
1331 Britain Hwy 1 0.1%
 
4965 MLK Drive 1 0.1%
 
Other values (990) 990 99.0%
 

Composition

Contains charsTrue
Contains digitsTrue
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length23
Mean length14.749
Min length11
Scatter

incident_severity
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minor Damage
354
Total Loss
280
Major Damage
276
Trivial Damage
90
ValueCountFrequency (%) 
Minor Damage 354 35.4%
 
Total Loss 280 28.0%
 
Major Damage 276 27.6%
 
Trivial Damage 90 9.0%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length14
Mean length11.62
Min length10
Scatter

incident_state
Categorical

Distinct count7
Unique (%)0.7%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
NY
262
SC
248
WV
217
VA
110
NC
110
Other values (2)
53
ValueCountFrequency (%) 
NY 262 26.2%
 
SC 248 24.8%
 
WV 217 21.7%
 
VA 110 11.0%
 
NC 110 11.0%
 
PA 30 3.0%
 
OH 23 2.3%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length2
Mean length2
Min length2
Scatter

incident_type
Categorical

Distinct count4
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
Multi-vehicle Collision
419
Single Vehicle Collision
403
Vehicle Theft
94
Parked Car
84
ValueCountFrequency (%) 
Multi-vehicle Collision 419 41.9%
 
Single Vehicle Collision 403 40.3%
 
Vehicle Theft 94 9.4%
 
Parked Car 84 8.4%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length24
Mean length21.371
Min length10
Scatter

injury_claim
Real number (ℝ≥0)

ZEROS
Distinct count638
Unique (%)63.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7433.42
Minimum0
Maximum21450
Zeros25
Zeros (%)2.5%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile450
Q14295
median6775
Q311305
95-th percentile15662
Maximum21450
Range21450
Interquartile range (IQR)7010

Descriptive statistics

Standard deviation4880.951853
Coefficient of variation (CV)0.6566226385
Kurtosis-0.7630870611
Mean7433.42
Median Absolute Deviation (MAD)4008.72512
Skewness0.2648108785
Sum7433420
Variance23823690.99
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 5.000e+00 2.650e+02 4.150e+02 6.900e+02 ... 4.980e+03 7.765e+03 1.568e+04 1.820e+04 2.145e+04], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 25 2.5%
 
480 7 0.7%
 
640 7 0.7%
 
580 5 0.5%
 
6340 5 0.5%
 
660 5 0.5%
 
780 5 0.5%
 
13520 5 0.5%
 
860 5 0.5%
 
1180 5 0.5%
 
Other values (628) 926 92.6%
 
ValueCountFrequency (%) 
0 25 2.5%
 
10 1 0.1%
 
220 1 0.1%
 
250 1 0.1%
 
280 2 0.2%
 
ValueCountFrequency (%) 
21450 1 0.1%
 
21330 1 0.1%
 
20700 1 0.1%
 
19020 1 0.1%
 
18520 1 0.1%
 
insured_education_level
Categorical

Distinct count: 7
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
JD
161
High School
160
Associate
145
MD
144
Masters
143
Other values (2)
247
ValueCountFrequency (%) 
JD 161 16.1%
 
High School 160 16.0%
 
Associate 145 14.5%
 
MD 144 14.4%
 
Masters 143 14.3%
 
PhD 125 12.5%
 
College 122 12.2%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length11
Mean length5.905
Min length2
Scatter

insured_hobbies
Categorical

Distinct count20
Unique (%)2.0%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
reading
 
64
paintball
 
57
exercise
 
57
bungie-jumping
 
56
camping
 
55
Other values (15)
711
ValueCountFrequency (%) 
reading 64 6.4%
 
paintball 57 5.7%
 
exercise 57 5.7%
 
bungie-jumping 56 5.6%
 
camping 55 5.5%
 
movies 55 5.5%
 
golf 55 5.5%
 
kayaking 54 5.4%
 
yachting 53 5.3%
 
hiking 52 5.2%
 
Other values (10) 442 44.2%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length14
Mean length8.113
Min length4
Scatter

insured_occupation
Categorical

Distinct count: 14
Unique (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
machine-op-inspct
 
93
prof-specialty
 
85
tech-support
 
78
sales
 
76
exec-managerial
 
76
Other values (9)
592
ValueCountFrequency (%) 
machine-op-inspct 93 9.3%
 
prof-specialty 85 8.5%
 
tech-support 78 7.8%
 
sales 76 7.6%
 
exec-managerial 76 7.6%
 
craft-repair 74 7.4%
 
transport-moving 72 7.2%
 
priv-house-serv 71 7.1%
 
other-service 71 7.1%
 
armed-forces 69 6.9%
 
Other values (4) 235 23.5%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length17
Mean length13.521
Min length5
Scatter

insured_relationship
Categorical

Distinct count: 6
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
own-child
183
other-relative
177
not-in-family
174
husband
170
wife
155
ValueCountFrequency (%) 
own-child 183 18.3%
 
other-relative 177 17.7%
 
not-in-family 174 17.4%
 
husband 170 17.0%
 
wife 155 15.5%
 
unmarried 141 14.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length14
Mean length9.466
Min length4
Scatter

insured_sex
Categorical

Distinct count2
Unique (%)0.2%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
FEMALE
537
MALE
463
ValueCountFrequency (%) 
FEMALE 537 53.7%
 
MALE 463 46.3%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length6
Mean length5.074
Min length4
Scatter
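
The length statistics above follow directly from the value counts. The sketch below is a minimal illustration in Python, assuming "Mean length" is the count-weighted average string length of the category labels; the variable names are illustrative.

```python
# Value counts copied from the insured_sex table above.
counts = {"FEMALE": 537, "MALE": 463}

total = sum(counts.values())
# Weight each label's length by how often the label occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / total
min_length = min(len(value) for value in counts)
max_length = max(len(value) for value in counts)

print(round(mean_length, 3), min_length, max_length)  # 5.074 4 6
```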

insured_zip
Real number (ℝ≥0)

Distinct count995
Unique (%)99.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean501214.488
Minimum430104
Maximum620962
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum430104
5-th percentile433273.75
Q1448404.5
median466445.5
Q3603251
95-th percentile617463.35
Maximum620962
Range190858
Interquartile range (IQR)154846.5

Descriptive statistics

Standard deviation71701.61094
Coefficient of variation (CV)0.1430557429
Kurtosis-1.190711054
Mean501214.488
Median Absolute Deviation (MAD)64266.73706
Skewness0.8165539259
Sum501214488
Variance5141121012
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[430104. 479882.5 600140. 620962. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
446895 2 0.2%
 
456602 2 0.2%
 
477695 2 0.2%
 
469429 2 0.2%
 
431202 2 0.2%
 
453277 1 0.1%
 
443625 1 0.1%
 
469653 1 0.1%
 
471704 1 0.1%
 
453274 1 0.1%
 
Other values (985) 985 98.5%
 
ValueCountFrequency (%) 
430104 1 0.1%
 
430141 1 0.1%
 
430232 1 0.1%
 
430380 1 0.1%
 
430567 1 0.1%
 
ValueCountFrequency (%) 
620962 1 0.1%
 
620869 1 0.1%
 
620819 1 0.1%
 
620757 1 0.1%
 
620737 1 0.1%
 

months_as_customer
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count391
Unique (%)39.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean203.954
Minimum0
Maximum479
Zeros1
Zeros (%)0.1%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile28.9
Q1115.75
median199.5
Q3276.25
95-th percentile429.05
Maximum479
Range479
Interquartile range (IQR)160.5

Descriptive statistics

Standard deviation115.1131744
Coefficient of variation (CV)0.5644075352
Kurtosis-0.4854280674
Mean203.954
Median Absolute Deviation (MAD)94.449356
Skewness0.3621768478
Sum203954
Variance13251.04293
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 77.5 299.5 479. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
194 8 0.8%
 
285 7 0.7%
 
140 7 0.7%
 
230 7 0.7%
 
128 7 0.7%
 
254 7 0.7%
 
101 7 0.7%
 
210 7 0.7%
 
156 6 0.6%
 
239 6 0.6%
 
Other values (381) 931 93.1%
 
ValueCountFrequency (%) 
0 1 0.1%
 
1 3 0.3%
 
2 2 0.2%
 
3 2 0.2%
 
4 3 0.3%
 
ValueCountFrequency (%) 
479 2 0.2%
 
478 2 0.2%
 
476 1 0.1%
 
475 2 0.2%
 
473 1 0.1%
 
number_of_vehicles_involved
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
1
581
3
358
4
 
31
2
 
30
ValueCountFrequency (%) 
1 581 58.1%
 
3 358 35.8%
 
4 31 3.1%
 
2 30 3.0%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length1
Mean length1
Min length1
Scatter

police_report_available
Categorical

Distinct count: 3
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
NO
343
?
343
YES
314
ValueCountFrequency (%) 
NO 343 34.3%
 
? 343 34.3%
 
YES 314 31.4%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length3
Mean length1.971
Min length1
Scatter

policy_annual_premium
Real number (ℝ≥0)

Distinct count991
Unique (%)99.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1256.40615
Minimum433.33
Maximum2047.59
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum433.33
5-th percentile855.112
Q11089.6075
median1257.2
Q31415.695
95-th percentile1653.4435
Maximum2047.59
Range1614.26
Interquartile range (IQR)326.0875

Descriptive statistics

Standard deviation244.167395
Coefficient of variation (CV)0.1943379495
Kurtosis0.07388944021
Mean1256.40615
Median Absolute Deviation (MAD)194.3547854
Skewness0.004401994527
Sum1256406.15
Variance59617.71676
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 433.33 764.655 964.855 1485.38 1632.015 1742.605 2047.59 ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
1374.22 2 0.2%
 
1558.29 2 0.2%
 
1389.13 2 0.2%
 
1073.83 2 0.2%
 
1074.07 2 0.2%
 
1281.25 2 0.2%
 
1215.36 2 0.2%
 
1524.45 2 0.2%
 
1362.87 2 0.2%
 
1139 1 0.1%
 
Other values (981) 981 98.1%
 
ValueCountFrequency (%) 
433.33 1 0.1%
 
484.67 1 0.1%
 
538.17 1 0.1%
 
566.11 1 0.1%
 
617.11 1 0.1%
 
ValueCountFrequency (%) 
2047.59 1 0.1%
 
1969.63 1 0.1%
 
1935.85 1 0.1%
 
1927.87 1 0.1%
 
1922.84 1 0.1%
 

policy_bind_date
Categorical

TYPE DATE
HIGH CARDINALITY
Distinct count951
Unique (%)95.1%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
1992-08-05
 
3
2006-01-01
 
3
1992-04-28
 
3
1999-04-07
 
2
1999-09-29
 
2
Other values (946)
987
ValueCountFrequency (%) 
1992-08-05 3 0.3%
 
2006-01-01 3 0.3%
 
1992-04-28 3 0.3%
 
1999-04-07 2 0.2%
 
1999-09-29 2 0.2%
 
1997-07-14 2 0.2%
 
2004-01-03 2 0.2%
 
1998-11-11 2 0.2%
 
1993-08-30 2 0.2%
 
2000-05-04 2 0.2%
 
Other values (941) 977 97.7%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length10
Mean length10
Min length10
Scatter

policy_csl
Categorical

Distinct count3
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
250/500
351
100/300
349
500/1000
300
ValueCountFrequency (%) 
250/500 351 35.1%
 
100/300 349 34.9%
 
500/1000 300 30.0%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length8
Mean length7.3
Min length7
Scatter

policy_deductable
Categorical

Distinct count: 3
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
1000
351
500
342
2000
307
ValueCountFrequency (%) 
1000 351 35.1%
 
500 342 34.2%
 
2000 307 30.7%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length4
Mean length3.658
Min length3
Scatter

policy_number
Real number (ℝ≥0)

UNIQUE
Distinct count1000
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean546238.648
Minimum100804
Maximum999435
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum100804
5-th percentile143969.6
Q1335980.25
median533135
Q3759099.75
95-th percentile954279.1
Maximum999435
Range898631
Interquartile range (IQR)423119.5

Descriptive statistics

Standard deviation257063.0053
Coefficient of variation (CV)0.4706056706
Kurtosis-1.132637689
Mean546238.648
Median Absolute Deviation (MAD)220126.3448
Skewness0.03899064218
Sum546238648
Variance6.608138868e+10
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[100804. 999435.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
116735 1 0.1%
 
107181 1 0.1%
 
430794 1 0.1%
 
115399 1 0.1%
 
328387 1 0.1%
 
824116 1 0.1%
 
492224 1 0.1%
 
663190 1 0.1%
 
936638 1 0.1%
 
193213 1 0.1%
 
Other values (990) 990 99.0%
 
ValueCountFrequency (%) 
100804 1 0.1%
 
101421 1 0.1%
 
104594 1 0.1%
 
106186 1 0.1%
 
106873 1 0.1%
 
ValueCountFrequency (%) 
999435 1 0.1%
 
998865 1 0.1%
 
998192 1 0.1%
 
996850 1 0.1%
 
996253 1 0.1%
 

policy_state
Categorical

Distinct count3
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
OH
352
IL
338
IN
310
ValueCountFrequency (%) 
OH 352 35.2%
 
IL 338 33.8%
 
IN 310 31.0%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length2
Mean length2
Min length2
Scatter

property_claim
Real number (ℝ≥0)

ZEROS
Distinct count626
Unique (%)62.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7399.57
Minimum0
Maximum23670
Zeros19
Zeros (%)1.9%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile450
Q14445
median6750
Q310885
95-th percentile15540
Maximum23670
Range23670
Interquartile range (IQR)6440

Descriptive statistics

Standard deviation4824.726179
Coefficient of variation (CV)0.6520279122
Kurtosis-0.3763863117
Mean7399.57
Median Absolute Deviation (MAD)3879.07238
Skewness0.3781687764
Sum7399570
Variance23277982.7
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 1.000e+01 2.450e+02 9.700e+02 1.520e+03 ... 4.695e+03 7.865e+03 1.557e+04 1.738e+04 2.367e+04], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 19 1.9%
 
860 6 0.6%
 
660 5 0.5%
 
480 5 0.5%
 
10000 5 0.5%
 
650 5 0.5%
 
11080 5 0.5%
 
640 5 0.5%
 
680 4 0.4%
 
5720 4 0.4%
 
Other values (616) 937 93.7%
 
ValueCountFrequency (%) 
0 19 1.9%
 
20 1 0.1%
 
240 1 0.1%
 
250 1 0.1%
 
260 1 0.1%
 
ValueCountFrequency (%) 
23670 1 0.1%
 
21810 1 0.1%
 
21630 1 0.1%
 
21580 1 0.1%
 
21240 1 0.1%
 

property_damage
Categorical

Distinct count3
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
?
360
NO
338
YES
302
ValueCountFrequency (%) 
? 360 36.0%
 
NO 338 33.8%
 
YES 302 30.2%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length3
Mean length1.942
Min length1
Scatter

total_claim_amount
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count763
Unique (%)76.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean52761.94
Minimum100
Maximum114920
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum100
5-th percentile4320
Q141812.5
median58055
Q370592.5
95-th percentile88413
Maximum114920
Range114820
Interquartile range (IQR)28780

Descriptive statistics

Standard deviation26401.53319
Coefficient of variation (CV)0.5003897353
Kurtosis-0.4540814267
Mean52761.94
Median Absolute Deviation (MAD)20738.4732
Skewness-0.5945819885
Sum52761940
Variance697040954.8
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.0000e+02 2.5800e+03 7.4900e+03 9.0600e+03 2.7900e+04 ... 5.1015e+04 6.4910e+04 7.9885e+04 9.1585e+04 1.1492e+05], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
59400 5 0.5%
 
75400 4 0.4%
 
60600 4 0.4%
 
2640 4 0.4%
 
58500 4 0.4%
 
44200 4 0.4%
 
4320 4 0.4%
 
3190 4 0.4%
 
70290 4 0.4%
 
70400 4 0.4%
 
Other values (753) 959 95.9%
 
ValueCountFrequency (%) 
100 1 0.1%
 
1920 1 0.1%
 
2160 1 0.1%
 
2250 1 0.1%
 
2400 1 0.1%
 
ValueCountFrequency (%) 
114920 1 0.1%
 
112320 1 0.1%
 
108480 1 0.1%
 
108030 1 0.1%
 
107900 1 0.1%
 

umbrella_limit
Real number (ℝ)

ZEROS
Distinct count11
Unique (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1101000
Minimum-1000000
Maximum10000000
Zeros798
Zeros (%)79.8%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum-1000000
5-th percentile0
Q10
median0
Q30
95-th percentile6000000
Maximum10000000
Range11000000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation2297406.598
Coefficient of variation (CV)2.086654494
Kurtosis1.79207731
Mean1101000
Median Absolute Deviation (MAD)1761398
Skewness1.806712199
Sum1101000000
Variance5.278077077e+12
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-1000000. -500000. 1000000. 2500000. 3500000. 7500000. 10000000.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 798 79.8%
 
6000000 57 5.7%
 
5000000 46 4.6%
 
4000000 39 3.9%
 
7000000 29 2.9%
 
3000000 12 1.2%
 
8000000 8 0.8%
 
9000000 5 0.5%
 
2000000 3 0.3%
 
10000000 2 0.2%
 
ValueCountFrequency (%) 
-1000000 1 0.1%
 
0 798 79.8%
 
2000000 3 0.3%
 
3000000 12 1.2%
 
4000000 39 3.9%
 
ValueCountFrequency (%) 
10000000 2 0.2%
 
9000000 5 0.5%
 
8000000 8 0.8%
 
7000000 29 2.9%
 
6000000 57 5.7%
 

vehicle_claim
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count726
Unique (%)72.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean37928.95
Minimum70
Maximum79560
Zeros0
Zeros (%)0.0%
Memory size7.9 KiB
Mini histogram

Quantile statistics

Minimum70
5-th percentile3273.5
Q130292.5
median42100
Q350822.5
95-th percentile63094.5
Maximum79560
Range79490
Interquartile range (IQR)20530

Descriptive statistics

Standard deviation18886.25289
Coefficient of variation (CV)0.4979376675
Kurtosis-0.4465729231
Mean37928.95
Median Absolute Deviation (MAD)14924.5768
Skewness-0.6210979312
Sum37928950
Variance356690548.3
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[7.0000e+01 1.5600e+03 3.2850e+03 5.3750e+03 6.5200e+03 2.1470e+04 3.2015e+04 5.5635e+04 6.3845e+04 7.9560e+04], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
5040 7 0.7%
 
3360 6 0.6%
 
3600 5 0.5%
 
44800 5 0.5%
 
33600 5 0.5%
 
4720 5 0.5%
 
52080 5 0.5%
 
42720 4 0.4%
 
46800 4 0.4%
 
41760 4 0.4%
 
Other values (716) 950 95.0%
 
ValueCountFrequency (%) 
70 1 0.1%
 
1440 2 0.2%
 
1680 2 0.2%
 
1750 1 0.1%
 
1760 1 0.1%
 
ValueCountFrequency (%) 
79560 1 0.1%
 
77760 1 0.1%
 
77670 2 0.2%
 
76400 1 0.1%
 
76000 1 0.1%
 

witnesses
Categorical

Distinct count4
Unique (%)0.4%
Missing0
Missing (%)0.0%
Memory size7.9 KiB
1
258
2
250
0
249
3
243
ValueCountFrequency (%) 
1 258 25.8%
 
2 250 25.0%
 
0 249 24.9%
 
3 243 24.3%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length1
Mean length1
Min length1
Scatter

Correlations

Missing values

Sample

First rows

_c39, age, authorities_contacted, auto_make, auto_model, auto_year, bodily_injuries, capital-gains, capital-loss, collision_type, fraud_reported, incident_city, incident_date, incident_hour_of_the_day, incident_location, incident_severity, incident_state, incident_type, injury_claim, insured_education_level, insured_hobbies, insured_occupation, insured_relationship, insured_sex, insured_zip, months_as_customer, number_of_vehicles_involved, police_report_available, policy_annual_premium, policy_bind_date, policy_csl, policy_deductable, policy_number, policy_state, property_claim, property_damage, total_claim_amount, umbrella_limit, vehicle_claim, witnesses
0NaN48PoliceSaab92x20041533000Side CollisionYColumbus2015-01-2559935 4th DriveMajor DamageSCSingle Vehicle Collision6510MDsleepingcraft-repairhusbandMALE4661323281YES1406.912014-10-17250/5001000521585OH13020YES716100520802
1NaN42PoliceMercedesE4002007000?YRiverwood2015-01-2186608 MLK HwyMinor DamageVAVehicle Theft780MDreadingmachine-op-inspctother-relativeMALE4681762281?1197.222006-06-27250/5002000342868IN780?5070500000035100
2NaN29PoliceDodgeRAM20072351000Rear CollisionNColumbus2015-02-2277121 Francis LaneMinor DamageNYMulti-vehicle Collision7700PhDboard-gamessalesown-childFEMALE4306321343NO1413.142000-09-06100/3002000687698OH3850NO346505000000231003
3NaN41PoliceChevroletTahoe2014148900-62400Front CollisionYArlington2015-01-1056956 Maple DriveMajor DamageOHSingle Vehicle Collision6340PhDboard-gamesarmed-forcesunmarriedFEMALE6081172561NO1415.741990-05-25250/5002000227811IL6340?634006000000507202
4NaN44NoneAccuraRSX2009066000-46000?NArlington2015-02-17203041 3rd AveMinor DamageNYVehicle Theft1300Associateboard-gamessalesunmarriedMALE6107062281NO1583.912014-06-06500/10001000367455IL650NO6500600000045501
5NaN39FireSaab952003000Rear CollisionYArlington2015-01-02198973 Washington StMajor DamageSCMulti-vehicle Collision6410PhDbungie-jumpingtech-supportunmarriedFEMALE4784562563NO1351.102006-10-12250/5001000104594OH6410NO641000512802
6NaN34PoliceNissanPathfinder201200-77000Front CollisionNSpringfield2015-01-1305846 Weaver DriveMinor DamageNYMulti-vehicle Collision21450PhDboard-gamesprof-specialtyhusbandMALE4417161373?1333.352000-06-04250/5001000413978IN7150?786500500500
7NaN37PoliceAudiA52015200Front CollisionNColumbus2015-02-27233525 3rd HwyTotal LossVAMulti-vehicle Collision9380Associatebase-jumpingtech-supportunmarriedMALE6031951653YES1137.031990-02-03100/3001000429027IL9380?515900328302
8NaN33PoliceToyotaCamry2012100Front CollisionNArlington2015-01-30214872 Rock RidgeTotal LossWVSingle Vehicle Collision2770PhDgolfother-serviceown-childFEMALE601734271YES1442.991997-02-05100/300500485665IL2770NO277000221601
9NaN42OtherSaab92x199620-39300Rear CollisionNHillsdale2015-01-05143066 Francis AveTotal LossNCSingle Vehicle Collision4700PhDcampingpriv-house-servwifeMALE6009832121?1315.682011-07-25100/300500636550IL4700NO423000329001

Last rows

_c39, age, authorities_contacted, auto_make, auto_model, auto_year, bodily_injuries, capital-gains, capital-loss, collision_type, fraud_reported, incident_city, incident_date, incident_hour_of_the_day, incident_location, incident_severity, incident_state, incident_type, injury_claim, insured_education_level, insured_hobbies, insured_occupation, insured_relationship, insured_sex, insured_zip, months_as_customer, number_of_vehicles_involved, police_report_available, policy_annual_premium, policy_bind_date, policy_csl, policy_deductable, policy_number, policy_state, property_claim, property_damage, total_claim_amount, umbrella_limit, vehicle_claim, witnesses
990NaN43FireJeepGrand Cherokee2013277500-32800Rear CollisionNNorthbrook2015-01-31184755 1st StMinor DamageNYSingle Vehicle Collision3810MDmoviesprof-specialtyunmarriedFEMALE4776442861YES1564.431994-02-05100/300500663190IL3810?342903000000266702
991NaN44OtherAccuraTL2002059400-32200Rear CollisionNRiverwood2015-02-06215312 Francis RidgeTotal LossWVSingle Vehicle Collision0MDbasketballother-serviceother-relativeMALE4339812571NO1280.882006-07-12100/3001000109392OH5220NO469800417601
992NaN26FireNissanPathfinder20101503000Front CollisionNSpringfield2015-01-2361705 Weaver StMajor DamageOHMulti-vehicle Collision3670MDcampingexec-managerialhusbandMALE433696943YES722.662007-10-24100/300500215278IN7340YES367000256902
993NaN28OtherVolkswagenPassat201200-32100Side CollisionNHillsdale2015-02-17201643 Washington HwyTotal LossOHMulti-vehicle Collision6020MDcampingexec-managerialhusbandMALE4435671243?1235.142001-12-08250/5001000674570OH6020?602000481601
994NaN30NoneHondaCivic199610-82100?NNorthbend2015-01-2266516 Solo DriveMinor DamageSCParked Car540High Schoolbungie-jumpingsalesown-childMALE4306651411YES1347.042007-03-24500/10001000681486IN1080?6480048602
995NaN38FireHondaAccord2006000Front CollisionNNorthbrook2015-02-22206045 Andromedia StMinor DamageNCSingle Vehicle Collision17440Masterspaintballcraft-repairunmarriedFEMALE43128931?1310.801991-07-16500/10001000941851OH8720YES872000610401
996NaN41FireVolkswagenPassat20152709000Rear CollisionNNorthbend2015-01-24233092 Texas DriveMajor DamageSCSingle Vehicle Collision18080PhDsleepingprof-specialtywifeFEMALE6081772851?1436.792014-01-05100/3001000186934IL18080YES1084800723203
997NaN34PoliceSuburuImpreza19962351000Side CollisionNArlington2015-01-2347629 5th StMinor DamageNCMulti-vehicle Collision7500Mastersbungie-jumpingarmed-forcesother-relativeFEMALE4427971303YES1383.492003-02-17250/500500918516OH7500?675003000000525003
998NaN62OtherAudiA51998000Rear CollisionNArlington2015-02-2626128 Elm LaneMajor DamageNYSingle Vehicle Collision5220Associatebase-jumpinghandlers-cleanerswifeMALE4417144581YES1356.922011-11-18500/10002000533940IL5220?469805000000365401
999NaN60PoliceMercedesE4002007000?NColumbus2015-02-2661416 Cherokee RidgeMinor DamageWVParked Car460AssociatekayakingsaleshusbandFEMALE6122604561?766.191996-11-11250/5001000556080OH920?5060036803